
    Enhanced Prediction of Three-dimensional Finite Iced Wing Separated Flow Near Stall

    Icing on three-dimensional wings causes severe flow separation near stall. Standard improved delayed detached eddy simulation (IDDES) is unable to correctly predict the separating and reattaching flow because it cannot accurately resolve the Kelvin-Helmholtz instability. In this study, a shear-layer-adapted subgrid length scale is applied to enhance the IDDES prediction of the flow around a finite NACA (National Advisory Committee for Aeronautics) 0012 wing with leading-edge horn ice. It is found that applying the new length scale contributes to a more accurate prediction of the separated shear layer (SSL). Reattachment occurs earlier as one moves towards either end of the wing, due to the downwash effect of the wing-tip vortex or the influence of end-wall flow. Consequently, the computed surface pressure distributions agree well with the experimental measurements, whereas standard IDDES severely elongates the surface pressure plateaus. For the instantaneous flow, the new length scale helps correctly resolve the rollup and subsequent pairing of vortical structures owing to its small values in the initial SSL. The computed Strouhal numbers of the vortical motions are approximately 0.2 in the initial SSL, based on the vorticity thickness, and 0.1 around reattachment, based on the separation bubble length. Both frequencies increase when moving towards the wing tip due to the downwash effect of the tip vortex. In comparison, the excessive eddy viscosity levels of standard IDDES severely delay the rollup of spanwise structures and give rise to "overcoherent" structures.
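The Strouhal normalizations quoted above can be sketched directly; a minimal example, where the frequencies, length scales, and velocity below are hypothetical illustrative values rather than numbers from the study:

```python
def strouhal(frequency_hz, length_scale_m, velocity_m_s):
    """Nondimensional frequency of a periodic flow motion: St = f * L / U."""
    return frequency_hz * length_scale_m / velocity_m_s

# Hypothetical scales, for illustration only:
# shear-layer rollup at 400 Hz, vorticity thickness 5 mm, edge velocity 10 m/s
st_ssl = strouhal(400.0, 0.005, 10.0)   # ~0.2, as in the initial SSL
# low-frequency motion at 5 Hz, bubble length 0.2 m, velocity 10 m/s
st_bubble = strouhal(5.0, 0.2, 10.0)    # ~0.1, as around reattachment
```

The same vortical motion thus yields different Strouhal numbers depending on which length scale (vorticity thickness vs. bubble length) is used in the normalization.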

    Boolean models for genetic regulatory networks

    This dissertation attempts to answer some of the vital questions involved in genetic regulatory networks: inference, optimization, and robustness of the mathematical models. Network inference constitutes one of the central goals of genomic signal processing. When inferring rule-based Boolean models of genetic regulation, the same values of predictor genes can correspond to different values of the target gene because of inconsistencies in the data set. To resolve this issue, a consistency-based inference method is developed to model a probabilistic genetic regulatory network, which consists of a family of Boolean networks, each governed by a set of regulatory functions. The existence of alternative function outputs can be interpreted as the result of random switches between the constituent networks. This model focuses on the global behavior of genetic networks and reflects both biological determinism and stochasticity. When inferring a network from microarray data, it is often the case that the sample size is not sufficiently large to infer the network fully, so it is necessary to perform model selection through an optimization procedure. To this end, the network connectivity and the physical realization of the regulatory rules should be taken into consideration. Two algorithms are developed for this purpose: one finds the minimal realization of the network constrained by the connectivity, and the other is mathematically proven to provide the minimally connected network constrained by the minimal realization. Genetic regulatory networks are subject to modeling uncertainties and perturbations, which raises the issue of robustness. From the perspective of network stability, robustness is desirable; however, from the perspective of intervention to exert influence on network behavior, it is undesirable. A theory is developed to study the impact of function perturbations in Boolean networks: it finds the exact number of affected state transitions and attractors, and predicts the new state transitions and robust/fragile attractors given a specific perturbation. Based on this theory, one algorithm is proposed to structurally alter the network to achieve a more favorable steady-state distribution, and another is designed to identify function perturbations that have caused changes in the network behavior.
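The counting of affected state transitions under a function perturbation can be sketched on a toy two-gene Boolean network; the truth tables and the specific perturbation below are illustrative assumptions, not constructions from the dissertation:

```python
from itertools import product

def next_state(state, functions, inputs):
    """Apply each gene's Boolean function to the current values of its regulators.

    functions[i]: truth table mapping a tuple of regulator values to 0/1.
    inputs[i]: indices of the regulators of gene i.
    """
    return tuple(functions[i][tuple(state[j] for j in inputs[i])]
                 for i in range(len(state)))

def affected_transitions(functions, inputs, perturbed):
    """Count the states whose successor changes under the perturbed functions."""
    n = len(functions)
    return sum(next_state(s, functions, inputs) != next_state(s, perturbed, inputs)
               for s in product((0, 1), repeat=n))
```

For example, in a two-gene network where each gene copies the other, flipping one output bit of gene 0's truth table changes the successor of exactly those states that feed the flipped entry, which is what the exhaustive count above reports.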

    Quantification of the Impact of Feature Selection on the Variance of Cross-Validation Error Estimation

    Given the relatively small number of microarrays typically used in gene-expression-based classification, all of the data must be used to train a classifier, and therefore the same training data are used for error estimation. The key issue regarding the quality of an error estimator in the context of small samples is its accuracy, and this is most directly analyzed via the deviation distribution of the estimator, namely the distribution of the difference between the estimated and true errors. Past studies indicate that, given a prior set of features, cross-validation does not perform as well in this regard as some other training-data-based error estimators. The purpose of this study is to quantify the degree to which feature selection increases the variance of the deviation distribution beyond the variance present in the absence of feature selection. To this end, we propose the coefficient of relative increase in deviation dispersion (CRIDD), which gives the relative increase in the deviation-distribution variance when using feature selection as opposed to using an optimal feature set without feature selection. The contribution of feature selection to the variance of the deviation distribution can be significant, contributing over half of the variance in many of the cases studied. We consider linear discriminant analysis, 3-nearest-neighbor, and linear support vector machines for classification; sequential forward selection, sequential forward floating selection, and the t-test for feature selection; and k-fold and leave-one-out cross-validation for error estimation. We apply these to three feature-label models and patient data from a breast cancer study. In sum, the cross-validation deviation distribution is significantly flatter when there is feature selection, compared with the case when cross-validation is performed on a given feature set. This is reflected in the observed positive values of the CRIDD, which is defined to quantify the contribution of feature selection to the deviation variance.
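One plausible reading of the CRIDD as a variance ratio can be sketched as follows; the exact definition in the paper may differ, and the deviation samples in the test are synthetic:

```python
from statistics import pvariance

def cridd(deviations_with_fs, deviations_without_fs):
    """Coefficient of relative increase in deviation dispersion (sketched form):
    the share of the deviation-distribution variance attributable to feature
    selection, relative to the variance with an optimal fixed feature set.

    Each argument is a sample of (estimated error - true error) deviations.
    """
    v_fs = pvariance(deviations_with_fs)    # variance with feature selection
    v_opt = pvariance(deviations_without_fs)  # variance with the optimal fixed set
    return (v_fs - v_opt) / v_fs
```

A CRIDD above 0.5 would then correspond to the situation described above, in which feature selection contributes over half of the deviation variance.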

    Preface


    Inference of a Probabilistic Boolean Network from a Single Observed Temporal Sequence

    The inference of gene regulatory networks is a key issue for genomic signal processing. This paper addresses the inference of probabilistic Boolean networks (PBNs) from observed temporal sequences of network states. Since a PBN is composed of a finite number of Boolean networks, a basic observation is that the characteristics of a single Boolean network without perturbation may be determined by its pairwise transitions. Because the network function is fixed and there are no perturbations, a given state will always be followed by a unique state at the succeeding time point. Thus, a transition counting matrix compiled over a data sequence will be sparse and contain only one entry per row. If the network also has perturbations, with small perturbation probability, then the transition counting matrix will have some insignificant nonzero entries replacing some (or all) of the zeros. If a data sequence is sufficiently long to adequately populate the matrix, then determination of the functions and inputs underlying the model is straightforward. The difficulty comes when the transition counting matrix contains data derived from more than one Boolean network. We address the PBN inference procedure in several steps: (1) separate the data sequence into "pure" subsequences corresponding to constituent Boolean networks; (2) given a subsequence, infer a Boolean network; and (3) infer the probabilities of perturbation, the probability of there being a switch between constituent Boolean networks, and the selection probabilities governing which network is to be selected given a switch. Capturing the full dynamic behavior of probabilistic Boolean networks, be they binary or multivalued, will require the use of temporal data, and a great deal of it. This should not be surprising given the complexity of the model and the number of parameters, both transitional and static, that must be estimated. In addition to providing an inference algorithm, this paper demonstrates that the data requirement is much smaller if one does not wish to infer the switching, perturbation, and selection probabilities, and that constituent-network connectivity can be discovered with decent accuracy for relatively small time-course sequences.
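The transition-counting idea above can be sketched for a single constituent network; the helper names and the noise threshold are illustrative assumptions, not the paper's algorithm:

```python
from collections import defaultdict

def transition_counts(sequence):
    """Tally observed pairwise transitions between successive network states."""
    counts = defaultdict(int)
    for cur, nxt in zip(sequence, sequence[1:]):
        counts[(cur, nxt)] += 1
    return counts

def infer_deterministic_map(counts, noise_threshold=1):
    """Keep, for each state, its most frequent observed successor.

    Transitions seen at or below the threshold are attributed to random
    perturbations and dropped, mirroring the 'insignificant nonzero entries'
    described above.
    """
    best = {}
    for (cur, nxt), c in counts.items():
        if c <= noise_threshold:
            continue
        if cur not in best or c > counts[(cur, best[cur])]:
            best[cur] = nxt
    return best
```

With a perturbation-free sequence, each state's row ends up with a single dominant entry and the deterministic successor map is recovered directly.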

    Uncertainty-Aware Bootstrap Learning for Joint Extraction on Distantly-Supervised Data

    Jointly extracting entity pairs and their relations is challenging when working with distantly-supervised data that carries ambiguous or noisy labels. To mitigate this impact, we propose uncertainty-aware bootstrap learning, motivated by the intuition that the higher the uncertainty of an instance, the more likely the model's confidence is inconsistent with the ground truth. Specifically, we first use instance-level data uncertainty to create an initial subset of high-confidence examples. This subset serves to filter out noisy instances and helps the model converge quickly in the early stage. During bootstrap learning, we propose self-ensembling as a regularizer to alleviate inter-model uncertainty produced by noisy labels. We further define the probability variance of joint tagging probabilities to estimate inner-model parametric uncertainty, which is used to select and build up new reliable training instances for the next iteration. Experimental results on two large datasets reveal that our approach outperforms existing strong baselines and related methods. Comment: ACL 2023 main conference short paper.
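The probability-variance selection step can be sketched as follows; the function names, the averaging over tags, and the variance threshold are assumptions for illustration, not the paper's exact formulation:

```python
from statistics import fmean, pvariance

def probability_variance(ensemble_probs):
    """Inner-model parametric uncertainty: variance of each tag's predicted
    probability across ensemble passes, averaged over the tags.

    ensemble_probs: list of passes, each a list of per-tag probabilities.
    """
    per_tag = zip(*ensemble_probs)
    return fmean(pvariance(tag_probs) for tag_probs in per_tag)

def select_reliable(instances, max_variance=0.01):
    """Keep instances whose probability variance falls below the threshold,
    building up the reliable training set for the next bootstrap iteration."""
    return [inst for inst, probs in instances
            if probability_variance(probs) <= max_variance]
```

Instances on which the ensemble passes disagree receive a high variance and are held out, which is the selection behavior the abstract describes.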

    Membrane topology analysis of HIV-1 envelope glycoprotein gp41

    Background: The gp41 subunit of the HIV-1 envelope glycoprotein (Env) has been widely regarded as a type I transmembrane protein with a single membrane-spanning domain (MSD). An alternative topology model suggested multiple MSDs. The major discrepancy between the two models is that the cytoplasmic Kennedy sequence in the single-MSD model is assigned as an extracellular loop accessible to neutralizing antibodies in the other model. We examined the membrane topology of the gp41 subunit in both prokaryotic and mammalian systems. We attached topological markers to the C-termini of serially truncated gp41. In the prokaryotic system, we utilized a green fluorescent protein (GFP) that is only active in the cytoplasm. The tag protein (HaloTag) and a membrane-impermeable ligand specific to HaloTag were used in the mammalian system. Results: In the absence of membrane fusion, both the prokaryotic and mammalian systems (293FT cells) supported the single-MSD model. In the presence of membrane fusion in mammalian cells (293CD4 cells), the data obtained seem to support the multiple-MSD model. However, the region predicted to be a potential MSD is the highly hydrophilic Kennedy sequence, which is least likely to become an MSD based on several algorithms. Further analysis revealed the induction of membrane permeability during membrane fusion, allowing the membrane-impermeable ligand and antibodies to cross the membrane. Therefore, we cannot completely rule out possible artifacts. Addition of membrane fusion inhibitors or alterations of the MSD sequence decreased the induction of membrane permeability. Conclusions: It is likely that the single-MSD model for HIV-1 gp41 holds true even in the presence of membrane fusion. The degree of the augmentation of membrane permeability we observed depended on the membrane fusion and the sequence of the MSD.

    GraphMAE2: A Decoding-Enhanced Masked Self-Supervised Graph Learner

    Graph self-supervised learning (SSL), including contrastive and generative approaches, offers great potential to address the fundamental challenge of label scarcity in real-world graph data. Among these graph SSL techniques, masked graph autoencoders (e.g., GraphMAE), a type of generative method, have recently produced promising results. The idea is to reconstruct the node features (or structures) that are randomly masked from the input with an autoencoder architecture. However, the performance of masked feature reconstruction naturally relies on the discriminability of the input features and is usually vulnerable to disturbances in the features. In this paper, we present a masked self-supervised learning framework, GraphMAE2, with the goal of overcoming this issue. The idea is to impose regularization on feature reconstruction for graph SSL. Specifically, we design the strategies of multi-view random re-mask decoding and latent representation prediction to regularize the feature reconstruction. Multi-view random re-mask decoding introduces randomness into reconstruction in the feature space, while latent representation prediction enforces the reconstruction in the embedding space. Extensive experiments show that GraphMAE2 can consistently generate top results on various public datasets, including at least 2.45% improvements over state-of-the-art baselines on ogbn-Papers100M, with 111M nodes and 1.6B edges. Comment: Accepted to WWW'23.
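The masking side of the re-mask decoding strategy can be sketched without any graph learning framework; the function names, the zeroing-out convention, and the per-view seeding are illustrative assumptions rather than GraphMAE2's implementation:

```python
import random

def remask(features, mask_rate, seed=None):
    """Zero out a random subset of node feature vectors; return the masked
    features and the indices of the masked nodes."""
    rng = random.Random(seed)
    n = len(features)
    masked = set(rng.sample(range(n), max(1, int(mask_rate * n))))
    out = [[0.0] * len(f) if i in masked else list(f)
           for i, f in enumerate(features)]
    return out, masked

def multi_view_remask(features, mask_rate, views=3, seed=0):
    """Multi-view random re-masking: each decoding view masks a different
    random subset, so the decoder cannot rely on any single node's raw
    feature when reconstructing the targets."""
    return [remask(features, mask_rate, seed=seed + v) for v in range(views)]
```

Each view would then be decoded and scored against the original features, with the randomness across views acting as the regularizer described above.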